Pandas DataFrame


In [2]:
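import pandas as pd

# DataFrame.from_csv is gone from current pandas; read_csv with these arguments mirrors its defaults
lc_data = pd.read_csv('./lc_dataframe(cleaning).csv', index_col=0, parse_dates=True)
lc_data = lc_data.reset_index()
lc_data.tail()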


Out[2]:
        loan_amnt  term  int_rate  installment  grade  sub_grade  emp_title  emp_length  home_ownership  annual_inc  ...    dti  delinq_2yrs  inq_last_6mths  open_acc  pub_rec  revol_bal  revol_util  total_acc  initial_list_status  acc_now_delinq
268131      31050    60     21.99       857.40      6         61          1          10               1    875000.0  ...   9.66            1               0        10        0      25770        79.3         13                    0               0
268132      10800    36      7.89       337.89      1         15          1           8               1     92400.0  ...  19.62            1               0        11        0       9760        68.7         36                    1               0
268133       9000    36      9.17       286.92      2         22          1           1               1     80000.0  ...   3.97            1               0         8        0       6320        51.8         17                    0               0
268134      14400    60     25.99       431.06      6         65          0          11               5     62000.0  ...  16.88            0               1         9        1       5677        45.1         30                    0               0
268135       8000    36     12.59       267.98      3         32          1           4               4     45000.0  ...  26.21            0               0        12        0       9097        50.8         47                    1               0

5 rows × 25 columns

V4 grade (categorical data type)

LC-assigned loan grade, encoded as A, B, C, D, E, F, G = {1, 2, 3, 4, 5, 6, 7}
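
The grade encoding described here can be sketched as a plain dictionary lookup. This is only an illustration, not the notebook's actual cleaning code; `GRADE_CODES` and `encode_grade` are hypothetical names introduced for the example.

```python
# Hypothetical helper: ordinal encoding of LC loan grades, A..G -> 1..7.
GRADE_CODES = {letter: code for code, letter in enumerate("ABCDEFG", start=1)}

def encode_grade(letter):
    """Return the integer code for a grade letter; raise on unknown input."""
    try:
        return GRADE_CODES[letter.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown grade: {letter!r}")

print([encode_grade(g) for g in ["A", "c", "G"]])  # [1, 3, 7]
```

On a DataFrame column holding the raw letters, the same dictionary could be applied with `Series.map(GRADE_CODES)`.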


In [28]:
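import matplotlib.pyplot as plt
import seaborn as sns

x = lc_data['grade']
sns.countplot(x=x, color='r')  # bar count per grade code; replaces the deprecated distplot
plt.show()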


V5 sub_grade (categorical data type)

LC-assigned loan subgrade: 1, 2, 3, 4, 5
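
The sample rows in Out[2] pair grade 6 with sub_grade 61 and grade 1 with sub_grade 15, which suggests (but does not confirm) that the column stores grade code × 10 + subgrade number. A minimal sketch under that assumption, also assuming the raw values look like 'F1'; `encode_sub_grade` is a hypothetical name.

```python
def encode_sub_grade(raw):
    """Encode a raw subgrade such as 'F1' as grade_code * 10 + subgrade (assumed scheme)."""
    raw = raw.strip().upper()
    letter, number = raw[0], int(raw[1:])
    grade_code = "ABCDEFG".index(letter) + 1  # A..G -> 1..7; ValueError if the letter is not a grade
    if not 1 <= number <= 5:
        raise ValueError(f"subgrade out of range: {raw!r}")
    return grade_code * 10 + number

print(encode_sub_grade("F1"), encode_sub_grade("A5"))  # 61 15
```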


In [30]:
x = lc_data['sub_grade']
sns.countplot(x=x, color='g')  # bar count per subgrade code; distplot is deprecated
plt.show()


V6 emp_title (categorical data type)

The job title supplied by the borrower when applying for the loan. Encoded as True = 1, False = 0


In [32]:
x = lc_data['emp_title']
plt.hist(x)
plt.show()


V7 emp_length (categorical data type)

Employment length in years. Possible values are between 0 and 10, where 0 means less than one year and 10 means ten or more years. Encoding: '< 1' = 0, '10+' = 10, 'n/a' = 11
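
The employment-length encoding can be sketched as follows. The raw string formats ('< 1 year', '3 years', '10+ years', 'n/a') are an assumption about the source data, and `encode_emp_length` is a hypothetical name.

```python
import re

def encode_emp_length(raw):
    """Map a raw employment-length string to 0..11, where 11 marks 'n/a'."""
    text = raw.strip().lower()
    if text == "n/a":
        return 11
    if text.startswith("<"):
        return 0  # '< 1 year' -> 0
    match = re.search(r"\d+", text)
    if match is None:
        raise ValueError(f"unrecognised employment length: {raw!r}")
    return min(int(match.group()), 10)  # '10+ years' -> 10

print([encode_emp_length(s) for s in ["< 1 year", "3 years", "10+ years", "n/a"]])  # [0, 3, 10, 11]
```

Keeping 'n/a' as its own code (11) rather than dropping it preserves the missing-value information, at the cost of breaking the numeric ordering for that one category.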


In [33]:
x = lc_data['emp_length']

sns.countplot(x=x, color='r')  # bar count per employment-length code; distplot is deprecated
plt.show()


V8 home_ownership (categorical data type)

The home ownership status provided by the borrower during registration or obtained from the credit report. Our values are: RENT, OWN, MORTGAGE, OTHER

Encoding: Mortgage, None, Other, Own, Rent = {1, 2, 3, 4, 5}
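
The codes listed here follow the alphabetical order of the labels, so one way to reproduce them is a sorted label encoding. This is a sketch, not necessarily how the notebook built the codes; `build_label_codes` and the sample values are hypothetical.

```python
def build_label_codes(labels):
    """Assign 1-based integer codes to the distinct labels in alphabetical order."""
    return {label: code for code, label in enumerate(sorted(set(labels)), start=1)}

# Assumed raw values, for illustration only.
raw = ["RENT", "OWN", "MORTGAGE", "OTHER", "NONE", "RENT"]
codes = build_label_codes(raw)
print(codes)                    # {'MORTGAGE': 1, 'NONE': 2, 'OTHER': 3, 'OWN': 4, 'RENT': 5}
print([codes[v] for v in raw])  # [5, 4, 1, 3, 2, 5]
```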


In [34]:
x = lc_data['home_ownership']

sns.countplot(x=x, color='g')  # bar count per ownership code; distplot is deprecated
plt.show()


V10 verification_status (categorical data type)

Indicates whether the income was verified by LC, not verified, or whether the income source was verified

Encoding: Source Verified and Verified = 1; Not Verified = 0


In [35]:
x = lc_data['verification_status']

sns.countplot(x=x)  # bar count per verification code; distplot is deprecated
plt.show()


V11 issue_d (categorical data type)

The month in which the loan was funded

Converted to mm (month number)
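
The month conversion noted here can be sketched as follows. The raw format 'Dec-2011' is an assumption about the source data, and `MONTH_NUMBER` and `issue_month` are hypothetical names.

```python
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
MONTH_NUMBER = {name: number for number, name in enumerate(MONTHS, start=1)}

def issue_month(raw):
    """Return the month number (1..12) from a string such as 'Dec-2011'."""
    name = raw.strip().split("-")[0].title()
    try:
        return MONTH_NUMBER[name]
    except KeyError:
        raise ValueError(f"unrecognised issue date: {raw!r}")

print(issue_month("Dec-2011"), issue_month("Mar-2014"))  # 12 3
```

An explicit month table is used instead of `datetime.strptime` so that the result does not depend on the locale of the machine running the notebook.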


In [36]:
x = lc_data['issue_d']
sns.countplot(x=x, color='r')  # bar count per issue month; distplot is deprecated
plt.show()


V14 purpose (categorical data type)

A category provided by the borrower for the loan request.

15 categories = {1:15} (from 1 to 15)


In [37]:
x = lc_data['purpose']

sns.countplot(x=x, color='g')  # bar count per purpose code; distplot is deprecated
plt.show()


V23 initial_list_status (categorical data type)

The initial listing status of the loan. Possible values are W and F.

Encoding: W = 1, F = 0


In [39]:
x = lc_data['initial_list_status']
plt.hist(x)
plt.show()